Redundancies in Large-scale Protein Interaction Networks

Abstract

Understanding functional associations among genes discovered in sequencing projects is a key issue in
post-genomic biology. However, reliable interpretation of the protein interaction data has been
difficult. In this work, we show that if two proteins share a significantly larger number of common
interaction partners than expected at random, they have close functional associations. Analysis of
publicly available data from Saccharomyces cerevisiae reveals more than 2800 reliable functional
associations, 29% of which involve at least one unannotated protein. By further analyzing these
associations, we derive tentative functions for 81 unannotated proteins with high certainty.



A large number of genes discovered in sequencing projects remain functionally unannotated,
motivating significant research in post-genomic biology. High-throughput experiments such as
genome-wide monitoring of mRNA expression as well as protein-protein interaction networks
are expected to be fertile sources of information for deriving their functions.
However, a high rate of false positives as well as the sheer volume of the data make
reliable interpretation of these experiments difficult.

In this work, we overcome these difficulties by using a statistical method that forms
reliable functional associations between proteins from noisy genome-wide interaction data. Our
method ranks the statistical significance of forming shared partnerships for all protein pairs in the
interaction network and shows that if two proteins share a significantly larger number of common
partners than expected at random, they have close functional associations. In the supplement, we
derive more than 2800 pairs of high quality associations for S. cerevisiae involving 852 proteins.
The method is not overly sensitive to the false positives widely present in the two-hybrid data. Even
after adding 50% randomly generated interactions to the measured dataset, we are able to recover almost
all (~90%) of the original associations. The modular nature of the interaction network [10] is
revealed by the clustering of these associations. From the derived modules, we are able to predict
functions for 81 unannotated proteins with high certainty. It is an encouraging sign that the
functions of some of these proteins were recently annotated by the SGD database [11] from other
sources after the completion of our work, and all but one (22 out of 23) of our predictions proved
to be correct.

Our strategy of assigning statistical significance is to compare the measured protein interaction
network with a random network of the same size. The deviation of the measured
network from randomness is presumed to reflect its biological significance. The non-random nature of
large-scale protein interaction networks has been discussed in earlier work. In one
example, it was observed that the connectivities of the proteins in the measured interaction
networks closely followed a power-law distribution instead of the exponential distribution expected
from random networks [9, 12, 13, 15]. A useful biological prediction regarding the lethality of
the null mutants lacking those highly connected proteins could be made from such non-random
behavior [12].

We hypothesize that if two proteins have a significantly larger number of common interaction
partners in the measured dataset than what is expected from a random network, it would suggest
close functional links between them. To validate this hypothesis, we rank all possible protein pairs
in the order of their probabilities [16] for having the experimentally measured number of common
interaction partners. If the computed probability is extremely small, it signifies that the chosen
protein pair has an unusually large number of common partners. Such pairs are considered for
further analysis, as we discuss in the paper.

FIG. 1: Probabilities of associations for all possible protein pairs derived using our method. Solid black line:
measured protein interaction network; broken red line: a random network of similar size constructed
by connecting randomly chosen nodes; dotted green line: a random network constructed from the measured
network keeping its power-law connectivity property unchanged [13]. The probabilities of associations for
the measured network are up to 40 orders of magnitude lower than those for the random networks.
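To make the ranking step concrete, here is a minimal sketch of how the probability in note [16] could be evaluated. It uses the hypergeometric form C(n1, m) C(N - n1, n2 - m) / C(N, n2), which is algebraically equal to the factorial expression in Appendix A. The function names and the choice of log-gamma arithmetic are our assumptions for illustration, not the authors' code; working in log space is simply one way to reach values as small as 10^-47 without underflow.

```python
from math import lgamma, log

LN10 = log(10.0)

def log10_comb(n, k):
    """log10 of the binomial coefficient C(n, k); -inf when C(n, k) = 0."""
    if k < 0 or k > n:
        return float("-inf")
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / LN10

def log10_shared_partner_prob(N, n1, n2, m):
    """log10 of the probability that two proteins with n1 and n2 partners
    share exactly m of them in a random network of N proteins."""
    return (log10_comb(n1, m)
            + log10_comb(N - n1, n2 - m)
            - log10_comb(N, n2))

if __name__ == "__main__":
    # Hypothetical numbers: two proteins with 10 partners each, 8 shared,
    # in a network the size of the dataset described in note [18].
    print(log10_shared_partner_prob(4692, 10, 10, 8))
```

A pair is then ranked by this log-probability, with more negative values indicating a more surprising overlap.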

The described method is applied to the available experimental data from budding yeast (S.
cerevisiae) collected in the DIP database from several sources. In Fig. 1, we show a plot
with probabilities for all protein pairs in the network sorted in increasing order. For comparison,
we also show corresponding probabilities for a random network of similar size, as well as a
randomized version of the measured network. The random network has the same number of nodes
and edges as the measured network, but the connections are made from a uniform distribution.
The randomization of the experimental network is done using a method similar to Ref. [13]. The
method allows us to maintain the power-law nature of the network. As we observe from the plot,
the probabilities of some of the associations in the measured network are up to 40 orders of
magnitude lower than in both of the randomly constructed networks. Therefore, it is safe to conclude
that those associations are not artifacts due to experimental noise, but contain biologically
meaningful information. It is also clear from the plot that such low-probability associations did not
arise from the scale-free nature of the network [12].
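The degree-preserving control can be sketched as follows. The text only says the randomization is "similar to Ref. [13]", so the edge-swap scheme below is an assumption about what such a procedure might look like, not the authors' implementation: two edges (a, b) and (c, d) are repeatedly replaced by (a, d) and (c, b), which leaves every protein's number of partners unchanged. The function name and parameters are hypothetical.

```python
import random

def rewire_preserving_degrees(edges, n_swaps, seed=0):
    """Randomize an undirected simple graph by edge swaps, keeping all degrees.

    edges   : list of (u, v) pairs, no self-loops or duplicates, at least 2 edges
    n_swaps : number of successful swaps to aim for
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:      # pick one of the two possible pairings
            c, d = d, c
        if a == d or c == b:        # swap would create a self-loop
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present or new1 == new2:
            continue                # swap would create a duplicate edge
        present.discard(frozenset((a, b)))
        present.discard(frozenset((c, d)))
        present.update((new1, new2))
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```

Because each swap removes one edge at every one of the four endpoints and adds one back, the power-law connectivity distribution of the measured network is carried over exactly to the randomized one.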



To understand what biological information is provided by such low-probability pairs, we
inspect all pairs with probabilities below a cutoff value of $10^{-8}$ [19]. The detailed list is provided
as a supplement [20] as well as on our website [21]. The group consists of 2833 protein pairs
involving 852 proteins. A strong functional link is observed among proteins in these pairs, thus
validating our hypothesis. This is illustrated in Table I, where we present the ten pairs with the
lowest probabilities. As we can see from the table, both proteins usually either belong to the same
complex or are parts of the same functional pathway. The same trend is generally true for the larger
dataset presented in the supplement. By manually inspecting the top 100 pairs, we found that in
over 95% of them both proteins have similar functions.
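The selection of low-probability pairs can be sketched end to end as below. This is an illustrative outline under stated assumptions, not the authors' pipeline: the input is a list of undirected interactions, N is taken to be the number of proteins appearing in that list, only pairs with at least one shared partner are scored (pairs with none cannot have an unusually large overlap), and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import lgamma, log

LN10 = log(10.0)

def _log10_comb(n, k):
    """log10 of C(n, k); -inf when the coefficient is zero."""
    if k < 0 or k > n:
        return float("-inf")
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / LN10

def significant_pairs(edges, log10_cutoff=-8.0):
    """Return sorted (log10 p, protein_a, protein_b) for pairs at or below the cutoff."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:                  # ignore self-interactions
            partners[a].add(b)
            partners[b].add(a)
    n_proteins = len(partners)

    # Any two partners of the same protein share at least that protein.
    candidates = set()
    for shared in list(partners.values()):
        candidates.update(combinations(sorted(shared), 2))

    hits = []
    for a, b in candidates:
        n1, n2 = len(partners[a]), len(partners[b])
        m = len(partners[a] & partners[b])
        log_p = (_log10_comb(n1, m)
                 + _log10_comb(n_proteins - n1, n2 - m)
                 - _log10_comb(n_proteins, n2))
        if log_p <= log10_cutoff:
            hits.append((log_p, a, b))
    return sorted(hits)
```

With the cutoff set to -8 this mirrors the threshold used in the text; the sorted output corresponds to the ranked list from which Table I takes its first ten rows.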

We can take advantage of the above observation to predict the functions of the unannotated
proteins. About 29% of the 2833 chosen pairs contain at least one unannotated protein.
To assign a function to any one of them, we determine the other proteins with which it forms
associations. As an example, in Table II we show that the unannotated protein YKL059C shares
partners with many proteins involved in transcription. Therefore, it is most likely also involved
in transcription. Moreover, from the low probabilities of associations with CFT2 and CFT1, we
strongly suspect that YKL059C is involved in pre-mRNA 3' end processing. This is further
confirmed by the clustering method that we present below. Our website provides an interactive
tool for users to search for the close associates of any query protein and thus derive its putative
function [21].

Since functionally related proteins form strong associations with each other, this can be used
as the basis for an algorithm to cluster them into functional modules. We derive 202 modules
[Fig. 2] from the associations and then compare the annotations of constituent proteins. 163 of the
derived modules have all proteins annotated in the SGD database [11], and we find 149 of them
(about 92%) to have all members of the module from the same functional complex or pathway.
Therefore, if an unannotated protein belongs to the same module as other proteins of known
function, we can predict its function to be the same as theirs with high confidence. By
analyzing the derived modules, we predict functions for 81 unannotated proteins and present them
in Table III.

We note that the chosen cut-off value ($10^{-8}$) is not a sharp threshold. As the number is
increased, the amount of biologically meaningful information degrades gradually. In the case of the
modules, their numbers and sizes increase with increasing cut-off. As an example, for the
well-studied mediator complex shown in Fig. 2(a), as we increase the cut-off value, more proteins



Protein 1   Protein 2   Log(p)   Function
MYO3        MYO5        -47.41   Class I myosins
ROX3        SRB6        -46.12   Mediator complex
KRR1        PWP2        -45.50   snoRNA complex
ROX3        MED2        -44.94   Mediator complex
MED2        SRB6        -42.19   Mediator complex
ATP1        ATP2        -42.17   ATP complex
KAP95       SRP1        -41.25   Protein import-export
PRE1        RPN10       -40.58   Spliceosome complex
YKR081C     YNL110C     -40.33   Both unannotated
RPT1        RPN6        -40.07   Spliceosome complex


TABLE I: The ten protein pairs with the lowest probabilities [16] based on our method, along with their
functions. We find both of the proteins in these pairs to belong to either the same complexes or the same
functional pathways. The complete list is provided as a supplement.

known to be part of the complex come together. We find that even with cut-off as high as 2 x 10"^, 
the proteins included in the mediator module are genuinely related to the complex. On our website
we present an interactive program that allows users to choose different cut-off values and obtain
the corresponding modules. Among the additional modules derived with a higher threshold, we
find two that contain mostly unannotated proteins and therefore are possibly large complexes not
yet well studied by experimentalists. One of them is suspected to be involved in actin cytoskeleton
organization and protein vacuolar targeting and the other one in splicing, rRNA processing
and snoRNA processing. We present them in Figs. 3 and 4, expecting their identification to spur
additional interest among yeast biologists.

The method presented here has several advantages. Firstly, it is not sensitive to random false
positives. To illustrate, we added connections randomly, increasing the average number of
interactions by 50%, and were still able to recover 90% of the top 2833 associations. Secondly, the
method is not biased by the number of partners a protein has. As an example, JSN1, a nuclear
pore protein, has the largest number of interactions in the measured dataset, but none of the 2833
associations derived by our method contains JSN1. Among the drawbacks, our method cannot
extract much information about proteins with no or very few interactions in the dataset.



Associations of YKL059C   Log(p)
CFT2[T]                   -32.430607
CFT1[T]                   -30.151475
YSH1[T]                   -28.320081
PTA1[T]                   -27.843331
PAP1[T]                   -27.410048
REF2[T]                   -25.048611
PFS2[T]                   -24.638901
YTH1[T]                   -23.247919
FIP1[T]                   -21.609526
HCA4[T]                   -21.285573
YGR156W[U]                -17.961537
RNA14[T]                  -17.732432
SWD2[U]                   -14.407007
GLC7[C]                   -13.284243
YOR179C[T]                -12.636400
PCF11[T]                  -8.857110



TABLE II: Categories - T: transcription, U: unannotated protein, C: cellular fate/organization. Most of the
associations of YKL059C are involved in transcription, and therefore it is also expected to be involved in
transcription. From its very low probabilities [16] of associating with CFT1 and CFT2, it is strongly
suspected to be involved in pre-mRNA 3' end processing. Our website provides an interactive tool to search
for the associates of any protein.

In conclusion, we derived functional modules and reliably predicted functions of unannotated
proteins from the existence of an abnormally large number of shared interaction partners in the
protein-protein interaction network. We believe the real power of the method will be in studying
the higher eukaryotes, where a higher fraction of genes have unknown functions. Moreover, the
method is applicable to other forms of networks, such as the Internet, metabolic networks, social
networks and predator-prey networks.

FIG. 2: Functional modules obtained by clustering the low-probability associations using an algorithm
described in our paper. All proteins from each of these derived modules belong to the same functional
complexes: (a) Pol II transcription mediator complex, (b) chaperone ring complex, (c) nuclear pore complex, (d)
oligosaccharyl transferase complex, (e) Arp2/3 complex, (f) ATP synthase complex. The complete list of
modules is provided in the supplemental table.

[6] H. Zhu et al. Science, 293:2101, 2001.
[7] B. Grunenfelder and E. A. Winzeler. Nature Rev. Genet., 3:653, 2002.
[8] P. Uetz and R. Hughes. Curr. Opinion in Microbiol., 3:304, 2000.
[9] C. von Mering et al. Nature, 417:399, 2002.
[10] L. H. Hartwell, J. J. Hopfield, S. Leibler, and A. W. Murray. Nature, 402:C47, 1999.
[11] J. M. Cherry et al. Nucleic Acids Res., 26:73, 1998.
[12] H. Jeong, S. P. Mason, A.-L. Barabasi, and Z. N. Oltvai. Nature, 411:41, 2001.
[13] S. Maslov and K. Sneppen. Science, 296:910, 2002.
[14] H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A.-L. Barabasi. Nature, 407:651, 2000.
[15] G. D. Bader and C. W. V. Hogue. Nature Biotech., 20:991, 2002.



[Figure 3: network diagram. Node labels: YGR128C, DIP2, YKR060W, YDR449C, MPP10, YLR409C, YDR324C,
NAN1, BMS1, SIK1, YMR093W, YPR144C, IMP3, YDL148C, YLR186W, ECM16, YIL109C, KRE33, NOP1, YBL004W,
BFR2, YGR145W, YJL069C, KRR1, PWP2, YGR090W, YER082C, ENP1, ROK1.]

FIG. 3: A module identified by our method consisting of proteins presumably involved in assembly and
maintenance of small nucleolar ribosomal complex.

[16] In a random network of $N$ proteins, the probability that two proteins with $n_1$ and $n_2$ partners
share $m$ common partners is expressed as

$$P(N, n_1, n_2, m) = \frac{\binom{n_1}{m}\binom{N-n_1}{n_2-m}}{\binom{N}{n_2}},
\qquad \text{where} \qquad \binom{N}{m} = \frac{N!}{m!\,(N-m)!}.$$

We start with the experimentally measured dataset of $N$ proteins and compute the probabilities for all
possible $N(N-1)/2$ pairs based on $n_1$, $n_2$ and $m$ obtained from the experimental data. Detailed
derivation of the expression is given in the supplement.
[17] C. M. Deane, L. Salwinski, I. Xenarios, and D. Eisenberg. Molecular and Cellular Proteomics,
1.5:349, 2002.

[Figure 4: network diagram. Node labels: ACF2, YJR083C, YMR192W, YNL094W, SYS1, YBR098W, YPR171W,
YGR268C, YOR284W, GTS1, YMR253C, YJL151C, YPL246C, KTR3, YKR030W, YLR064W.]

FIG. 4: A module identified by our method consisting of proteins presumably involved in actin cytoskeleton
organization and protein vacuolar transport.

[18] We used the 09/01/2002 update of the DIP dataset containing 14871 interactions for 4692 proteins.

[19] Since the dataset contains $N = 4692$ proteins, $1/N^2 \sim 10^{-8}$ is a reasonable cutoff. The number is
validated by a more rigorous comparison with the random networks shown in Fig. 1. However, this is not a
sharp threshold as we discuss in more detail in the paper. Therefore, we present pairs up to 2 x 10"'* 
in the supplement. 

Protein                                                             Predicted Function
YFR024C (YSC85), BZZ1 [23], YNL094W (APP1), YMR192W (APP2)          Actin filament organization
YGR268C (HUA1), YOR284W (HUA2), YPR171W (BSP1)                      Actin patch assembly
YJR083C (ACF4)                                                      Actin cytoskeleton organization and biogenesis
YDR036C (MRP5)                                                      Protein biosynthesis in mitochondrial small ribosomal subunit
YKL214C (YRA2) [23]                                                 mRNA processing/RNA metabolism
YNL207W (RIO2)                                                      Nucleolar protein involved in 40S ribosomal biogenesis
YER082C (UTP7) [23], YJL069C (UTP18) [23], ENP1                     Associated with U3 snoRNA and 20S rRNA biosynthesis
YMR288W (HSH155) [23]                                               snRNA binding involved in mRNA splicing
YHR197W (IPI2), YNL182C (IPI3), YLR106C (MDN1) [23]                 Ribosomal large subunit assembly and maintenance
YGR128C (UTP8) [23]                                                 Processing of 20S pre-rRNA
YGR215W (RSM27) [23], YGL129C (RSM23) [23]                          Structural constituent of ribosome
YDL213C (NOP6)                                                      rRNA processing/transcription elongation
YNL306W (MRPS18) [23]                                               Mitochondrial small ribosomal subunit
[illegible], YJL109C (UTP10) [23], YBL004W (UTP20)                  snoRNA binding, 35S primary transcript processing
YGL099W (LSG1) [23], YDR101C (ARX1)                                 27S pre-rRNA ribosomal subunit
BRX1, YOR206W (NOC2), FPR1                                          Biogenesis and transport of ribosome
YOR145C (DIM2)                                                      35S primary transcript processing and rRNA modification
YEL015W (DCP3)                                                      Deadenylation-dependent decapping and mRNA catabolism
NHP10, RFX1 [23]                                                    Modification of chromatin architecture/transcription
YDR469W (SDC1) [23]                                                 Chromatin silencing and histone methylation
YPL070W (MUK1)                                                      Transcription factor (or its carrier)
YLR427W (MAG2)                                                      DNA N-glycosylase involved in DNA dealkylation
YDL076C (RXT3), YIL112W (HOS4)                                      Histone deacetylase complex involved in chromatin silencing
IST1                                                                Transcription initiation factor
HCR1 [23]                                                           Translation initiation as part of eIF3 complex
YDL074C (BRE1)                                                      Chromosome condensation and segregation process
YGR156W (PTI1) [23], YKL059C (MPE1) [23]                            mRNA cleavage and polyadenylation specificity factor
YGR089W (NNF2)                                                      Chromosome segregation (spindle pole) and mitosis
YGL161C (YIP5), YGL198W (YIP4)                                      Vesicle-mediated transport
[illegible]                                                         Cell wall synthesis / protein-vacuolar targeting
YBR098W (MMS4)                                                      Golgi to endosome transport and vesicle organization
YHR105W (YPT35)                                                     Golgi to vacuolar transport
YBL049W (MOH1), YCL039W (MOW1)                                      Both same function. Possibly linked with vacuolar transport
YDL246C (SOR2)                                                      Possibly involved in fructose and mannose metabolism
YMR322C (SNO4)                                                      Pyridoxine metabolism
YDR430C (CYM1)                                                      Protein involved in pyruvate metabolism
YJL199C (MBB1), YPL004C (LSP1), YGR086C (PIL1)                      Metabolic protein
YLR097C (HRT3)                                                      Nuclear ubiquitin ligase
YKR046C (PET10)                                                     ATP/ADP exchange
YEL017W (GTT3)                                                      Protein linked with glutathione metabolism
ITC1                                                                Chromatin remodeling
YGR161C (RTS3)                                                      Protein phosphatase 2A complex
EFD1                                                                DNA replication and repair
YML117W (NAB6)                                                      Nuclear RNA binding
YLR432W ([illegible])                                               RNA helicase involved in mRNA splicing
YJU2, YGR278W (CWC22), YDL209C (CWC2) [23]                          Spliceosome complex involved in mRNA splicing
YGR232W (NAS6) [23], YGL004C (RPN14) [23], YLR421C (RPN13) [23]     Proteasome complex

TABLE III: Predicted functions of previously unannotated proteins.

FIG. 5: In the above interaction network, both proteins A and B have 4 partners ($n_1$ and $n_2$). Two of the
partners (marked by dark circles) are shared by both of them. We compute the probability for such an event
to occur in a random network. If the computed probability is low, the two proteins are perhaps redundant in
their functions.

APPENDIX A: SUPPLEMENTAL INFORMATION (THEORY)
1. Methods

a. Mathematical Expression for Probability

In a network of $N$ proteins, the probability that two proteins with $n_1$ and $n_2$ partners [Fig. 5]
share $m$ common partners is given by

$$P(N, n_1, n_2, m) = \frac{(N-n_1)!\,(N-n_2)!\,n_1!\,n_2!}{N!\,m!\,(n_1-m)!\,(n_2-m)!\,(N-n_1-n_2+m)!}. \tag{A1}$$

The above expression is symmetric with respect to interchange of $n_1$ and $n_2$. Eq. (A1) is derived in
the following manner. It is a ratio where the denominator is the total number of ways two proteins
can have $n_1$ and $n_2$ partners, given by

$$\binom{N}{n_1}\binom{N}{n_2}, \tag{A2}$$

whereas the numerator is the number of cases among them where $m$ of those partners are common
to both of them. It is expressed as

$$\binom{N}{m}\binom{N-m}{n_1-m}\binom{N-n_1}{n_2-m} = \frac{N!}{m!\,(n_1-m)!\,(n_2-m)!\,(N-n_1-n_2+m)!}. \tag{A3}$$

The numerator can be derived using the following argument. In the combinatorial product on the
left hand side, the first term represents the number of ways $m$ common partners can be chosen
from all proteins. For the first protein, we choose $n_1 - m$ remaining partners out of the remaining
$N - m$ proteins. This is the second term in the product. Subsequently, for the second protein, we
choose $n_2 - m$ remaining partners, none of which match any of the $n_1$ partners of the first protein.
This contributes the third term.

For the calculations in our paper, the results are approximately the same whether we compute
the probabilities for pairs with exactly $m$ common partners or for $m$ or more partners.
It can be checked from the expression of probability in Eq. (A1) that probability terms for increasing
$m$ fall inversely with $N$. Since $N$ for our case is about 5000, the additional terms in the probability
expression are negligible.

FIG. 6: In our clustering algorithm, we start with a matrix with p-values for all pairs. If the element (m, n)
has the lowest p-value, a cluster is formed with proteins m and n. Therefore, rows/columns m and n are
merged, with the new p-value of the merged row/column taken as the geometric mean of the separate p-values
of the corresponding elements.
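The claim that successive terms fall inversely with $N$ can be checked directly from Eq. (A1). The short derivation below is our own worked step, using only the factors of Eq. (A1) that depend on $m$; the final approximation assumes $n_1, n_2 \ll N$, which is an assumption about typical proteins in the dataset rather than a statement from the text.

$$\frac{P(N, n_1, n_2, m+1)}{P(N, n_1, n_2, m)}
= \frac{m!\,(n_1-m)!\,(n_2-m)!\,(N-n_1-n_2+m)!}{(m+1)!\,(n_1-m-1)!\,(n_2-m-1)!\,(N-n_1-n_2+m+1)!}
= \frac{(n_1-m)(n_2-m)}{(m+1)\,(N-n_1-n_2+m+1)}
\approx \frac{(n_1-m)(n_2-m)}{(m+1)\,N}.$$

Each additional shared partner therefore costs roughly a factor of $N$, so for pairs with few partners the tail sum over $m$ or more partners is dominated by its first term.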

Protein 1     Protein 2   Index
MYO3          MYO5        1
GIC1          GIC2        72
TIF4632       TIF4631     145
NUP100        NUP116      476
HSC82         HSP82       485
ZDS1          ZDS2        564
PPH21         PPH22       579
KCC4          GIN4        606
RFC3          RFC4        634
CLN1          CLN2        918
GSP2          GSP1        1288
YPT32         YPT31       1550
BOI1          BOI2        1640
SEC4          YPT7        1785
[illegible]   VPS21       [illegible]
BMH1          BMH2        1920
PCL7          PCL6        1926
YGR010W       YLR328W     2162
MYO4          MYO2        2474
SAP190        SAP185      2721
MKK1          MKK2        2725
IMD4          YLR432W     2746

TABLE IV: Associations derived by us which were also ancient paralogs according to Ref. [24]. The third
column in the table represents the indices for the pairs in the list of associations sorted according to
increasing probabilities. The list is also available as supplementary material.

b. Clustering Technique

Our clustering method is as follows. We compute p values for all possible protein pairs and
store them in a matrix. Then we pick the protein pair with the lowest p value and choose it as the
first group in the cluster. The rows and columns for these two proteins are merged into one row
and one column [Fig. 6]. Probability numbers for this new group are geometric means of the two
probabilities [or arithmetic means of the log(p) values]. The process is continued repeatedly, thus
adding more and more clusters as well as making the existing ones bigger, until a threshold is
reached.
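The merging procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' program: log(p) values are supplied as a dictionary keyed by unordered protein pairs, any pair missing from the dictionary is treated as having log(p) = 0 (that default is our assumption), and the merged row is the plain arithmetic mean of the two log(p) rows, as the text specifies. The function name is hypothetical.

```python
def cluster_by_log_p(names, log_p, threshold):
    """Greedy agglomerative clustering on log10(p) values.

    names     : list of protein names
    log_p     : dict mapping frozenset({a, b}) -> log10 p for that pair
    threshold : stop once the lowest remaining value exceeds this (e.g. -8.0)
    Returns a list of clusters (sorted name lists) with two or more members.
    """
    members = {i: [name] for i, name in enumerate(names)}
    score = {}
    for x in range(len(names)):
        for y in range(x + 1, len(names)):
            score[(x, y)] = log_p.get(frozenset((names[x], names[y])), 0.0)

    next_id = len(names)
    while score:
        (a, b), best = min(score.items(), key=lambda kv: kv[1])
        if best > threshold:
            break
        merged = members.pop(a) + members.pop(b)
        # New row: arithmetic mean of the two log(p) rows,
        # i.e. the geometric mean of the two probabilities.
        new_scores = {}
        for c in members:
            s_a = score[(min(a, c), max(a, c))]
            s_b = score[(min(b, c), max(b, c))]
            new_scores[(c, next_id)] = (s_a + s_b) / 2.0
        score = {k: v for k, v in score.items() if a not in k and b not in k}
        score.update(new_scores)
        members[next_id] = merged
        next_id += 1
    return [sorted(m) for m in members.values() if len(m) > 1]
```

Raising the threshold lets more pairs merge, which matches the observation in the main text that modules grow in number and size as the cut-off is increased. The sketch recomputes the minimum on every pass, which is adequate for illustration but would be slow on the full matrix.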



2. Ancient Paralogs

Ref. [24] proposed the possibility of a duplication of the entire yeast genome in the distant past
and presented a list of genes that were identical or matched closely due to this event. We check
how many of the associations derived by us were also such ancient paralogs and present them in
Table IV. We find 22 such ancient paralogs among the list of top 2833 pairs (about 0.8%). Therefore,
these are the ancient paralogs that maintained their functions over time.

APPENDIX B: SUPPLEMENTAL INFORMATION (FUNCTIONAL MODULES DERIVED
USING OUR TECHNIQUE)

MRP4
MRPS5
YDR036C

MRP4 and MRPS5 are structural constituents of mitochondrial small
ribosomal subunit involved in protein biosynthesis. Therefore, yet
unannotated YDR036C (MRP5) is strongly suspected to have similar
function. Their functional link can be checked by double or triple
deletion experiments.



GAL11
ROX3
SRB6
MED2
MED7

Parts of RNA polymerase II transcription mediator complex involved
in transcription from Pol II promoter.



CDC53
SKP1

Parts of nuclear ubiquitin ligase complex involved in ubiquitin-protein
ligase activity during G1/S and G2/M transitions of mitotic cell cycle.



BET3 
TRS20 



Parts of TRAPP complex involved in targeting and fusion of ER to 
Golgi vesicles. 



CKA2
CKB1
CKA1
CKB2



Alpha and beta subunits of casein kinase II complex involved in 
regulation of several cellular processes. 



PRP45
CEF1
YJU2

PRP45 and CEF1 are involved in pre-mRNA splicing with the
spliceosome. Therefore, yet unannotated YJU2 is strongly suspected
to have similar function.



YGL198W
YGL161C
YIF1
GDI1

GDI1 is a RAB GDP-dissociation inhibitor involved in vesicle-mediated
transport, whereas YIF1 is part of COPII-coated vesicle
involved in ER to Golgi transport. Therefore, yet unannotated
YGL161C (YIP5) and YGL198W (YIP4) are strongly suspected to be
involved in similar cellular functions. YGL161C and YGL198W are
not homologous to the remaining two proteins.





CCT5
CCT2
TCP1
CCT6

Parts of chaperone ring complex located in the cytoplasm and assisting
in protein folding.




DIG1
DIG2

Both are MAP kinase-associated proteins involved in down-regulation of
invasive growth. It is known that each of them is viable, but double
deletion results in loss of function.




RVS161

Parts of actin cortical patch linked with endocytosis. Close functional
link is well known.




ZDS1
ZDS2

Both control cell polarity, although ZDS1 is located in the bud tip and
ZDS2 at the nucleus. It is known that each gene is viable but double
deletion affects cell-cycle progression. They are also ancient paralogs.








KAP95
SRP1
TEM1
YPL070W

KAP95 is part of nuclear pore complex involved in protein nuclear
import. SRP1 is part of nuclear pore complex involved in
nucleocytoplasmic transport. TEM1 is a GTP-binding protein involved in
termination of M-phase of cell cycle. Link is not clear to us. These
four proteins also have strong association with GCD7, a translation
initiation factor. We suspect YPL070W (MUK1) is either a
transcription factor or involved in transporting of transcription factor.




MAG1
YLR427W

MAG1 is an alkyl-base DNA N-glycosylase involved in DNA
dealkylation. Therefore, YLR427W (MAG2) is suspected to have
similar function.








RPC34
RPO31
RPC25
RPC40

Parts of Pol III complex involved in transcription from Pol III
promoter. In addition, RPC40 is also part of Pol I complex.





YNL207W

TSR1 is a nucleolar protein involved in 40S ribosomal biogenesis.
Therefore, yet unannotated YNL207W (RIO2) possibly has similar
function. Moreover, both of them share common partners with ENP1,
a nuclear protein linked with cell growth and maintenance.




MCM? 


Both are ATP-dependent helicases involved in DNA-replication initiation,
DNA unwinding and pre-replicative complex formation.






STT3
SWP1
OST2
OST3
OST4
OST5
WBP1
OST1

Parts of oligosaccharyl transferase complex involved in N-linked
glycosylation.












SUI1

SUI1 is part of translation initiation factor whereas PRP6 is a
pre-mRNA splicing factor.
















MRPL10
MRPL9
MRPL16
MRPL19

Mitochondrial ribosomal proteins of the large subunit.




APG12

APG12 is a membrane-located protein involved in autophagy and
protein-vacuolar targeting, whereas YGR262C (BUD32) is a protein
serine/threonine kinase linked with bud site selection. We suspect they
have a closer functional link. Both of them share many partners with the
ubiquitin complex.


HSM3 


HSM3 is involved in DNA mismatch repair pathway, whereas RAD23 
is involved in nucleotide-excision repair and DNA damage 
recognition. 



CLN1
CLN2



Regulation of cell-cycle controlling the START point. They are also 
ancient paralogs. 



SOR1
YDL246C

SOR1 is involved in fructose and mannose metabolism. Therefore, yet
unannotated YDL246C (SOR2) is also possibly a metabolic protein.



MCD1
IRR1



Parts of cohesion complex involved in mitotic sister chromatid 
cohesion. 



NUP84 

NUP133 

NUP85 

NUP145 

NUP120 



Forms a nuclear pore complex importing-exporting materials between 
nucleus and cytoplasm. 



SLA2
ABP1



Proteins involved in organizing actin filaments for cell polarization 
and endocytosis. 



TIF4632 
TIF4631 
CDC33 



Translation initiation factor. TIF4632 and TIF4631 are ancient 
paralogs. 



SIR4 



SIR3 



Regulators of silencing at HML, HMR, and telomeres. 



APC4
CDC16
APC11
APC5
CDC23
APC2
CDC27
APC1
DOC1
CDC26
APC9

Anaphase-promoting complex involved in mitotic metaphase-anaphase
transition.



PRP3 

PRP31 

PRP4 

SNU114 

SMB1

SNU66 



Involved in pre-mRNA splicing. 



SRV2
ACT1

Involved in cytoskeleton organization and biogenesis.


KRR1
PWP2
YGR090W
YER082C
ENP1
YJL069C

Linked with snoRNA complex involved in processing of
20S pre-rRNA.



SEH1
SEC13

SEH1 is a nuclear-pore protein involved in import-export between
nucleus and cytoplasm. SEC13 is a nuclear pore complex subunit involved
in release of transport vesicles from the ER.



RFX1
NHP6B
NHP10

NHP6B is a chromatin binding protein involved in establishment
and maintenance of chromatin architecture as well as regulation of
transcription from Pol II and Pol III promoters. Therefore, the other two
are strongly suspected to have similar function. We note that RFX1
has DNA binding domains.



ORC6 
ORC4 
ORC2 
ORC5 
ORC3 
CDC6 
ORC1



Origin recognition complex involved in DNA replication. 



ISW2
VPS1
MOT1

ISW2 is an ATPase linked with chromatin modeling. MOT1 is an
ATPase regulating transcription from Pol II promoter. VPS1 is a
GTPase involved in protein-vacuolar targeting and vacuolar transport.
It is not clear to us why VPS1 forms strong associations with the other
two.



SRB9 
SSN3 
SSN8 



Pol II transcription factors.



YDR060W
NOP4
YDL213C
ELA1
MAK5

MAK5, NOP4 and YDR060W are involved in rRNA processing and
ribosomal large subunit assembly and maintenance. ELA1 is a
transcription elongation factor involved in RNA elongation from Pol II
promoter. Its link with the other three is not clear to us.
Unannotated YDL213C (NOP6) is strongly suspected to have similar
function.



RPF1
DNL4

Both RPF1 and DNL4 form strong associations with other proteins
involved in ribosomal large subunit assembly and maintenance. DNL4
is currently annotated as a DNA ligase that is active in double strand
break repair via non-homologous end joining.



GAR1
CBF5 



35S primary transcript processing in small nuclear ribo-nucleoprotein 
complex. 



YEL015W 
DCP2 



DCP2 is linked with deadenylation-dependent decapping and mRNA
catabolism. Therefore, YEL015W (DCP3) is suspected to be involved
in similar functions. They both share many common partners with
LSM proteins of small nuclear ribonucleoprotein complex.



SNZ3
SNZ2
SNZ1

They all belong to a stationary phase-induced gene family involved in
pyridoxine metabolism.



SHG1
YDR469W
BRE2
SET1
SPP1
SWD1
SWD3



Complex involved in chromatin silencing at telomere and histone 
methylation. 



TAF4n 

T APTI Q 
1 A r 1 H 


TFIID complex active with transcription initiation from Pol II
promoter.




Involved in DNA-replication initiation. 








ARP2 

ARP3 

ARC15 

ARC18 

ARC35 
ARC19 
ARC40 


Arp2/3 complex involved in cell growth and maintenance. 


CKS1 

CDC28 


CDC28-complex involved in the cell-cycle. 


YRA1 
YKL214C 


They are both yeast RNA Annealing proteins. YRA1 is linked with 
mRNA processing. Therefore, we suspect YKL214C (YRA2) has 
similar function. 


VPS8 

VPS41 


Both are involved in homotypic vacuole fusion (non-autophagic) and 
vacuole organization and biogenesis. Since both deletions are viable, a 
double-deletion experiment can be tried. 


SSB1 

SSA1 


They are both heat-shock proteins acting as chaperones that help 
protein folding. SSA1 is also involved in protein transport between 
nucleus and cytoplasm. It is grouped with the SRM1 complex in 
cluster-1069. 




YFR024C 

YSC84 

SLA1 


YSC84 and SLA1 are involved in actin filament organization. 
Therefore, unannotated YFR024C and BZZ1 are strongly suspected to 
be involved in similar functions. 
















HOS2 

YIL112W 

SNT1 
SIF2 
HST1 
SET3 


Parts of histone deacetylase complex. 



ERV25 
EMP24 
SEC16 



Parts of COPII-coated vesicle complex linked with ER to Golgi 
transport. 



BOI1 
BOI2 



Involved in rho-protein signal transduction and establishment of cell 
polarity. 



ATP17 

ATP5 

ATP1 

ATP2 

ATP18 

ATP6 

ATP7 



Proton-transporting ATP synthase complex. 



TFG1 



TFG2 



Transcription factor TFIIF large subunit. 



LCB2 
SEC7 



SEC7 is an ARF guanyl-nucleotide exchange factor involved in protein 
transport, whereas LCB2 is a serine palmitoyltransferase involved in 
sphingolipid biosynthesis. The link is not clear to us. 



TRS120 
TRS130 



TRAPP involved in ER to Golgi transport. 



RPB7 

RPB3 

RPB2 

RPO21 

RPB4 



DNA-directed RNA polymerase II, core. 



PPH21 
PPH22 
TPD3 



Protein phosphatase type 2A complex. PPH21 and PPH22 are ancient 
paralogs. 



YBL049W 
YCL039W 



Two unannotated proteins that form associations with each other. 
From their other associations, we suspect that they may be involved in 
vacuolar transport. Since both are viable, a double-deletion 
experiment may link them and identify their function. 













RET3 
SEC28 

SEC26 
SEC21 

COP1 

SEC27 

Rbl2 

BET1 

BOS1 

SEC22 


COPI or COPII vesicle coats involved in ER to Golgi transport or 
retrograde transport. 






VMA2 
VMA8 

STV1 
VPH1 


Hydrogen transporting ATPase. 










V iVl A / 






PRP21 
PRP9 

YMR288W 
RSE1 

LEA1 


U2 snRNA binding involved in mRNA splicing. Unannotated 
YMR288W (HSH155) is strongly suspected to be involved in the 
same process. 












SRB7 

SRB2 

SRB4 


Suppressors of RNA polymerase II, possible components of the 
holoenzyme. They share many partners with members of the mediator 
complex. 






YGR268C 

YOR284W 

YPR171W 


They are unclassified proteins strongly suspected to be involved with 
actin patch/filament assembly aiding in cytokinesis. They also share 
many partners with members of the ACF2-complex discussed below. 




SDS3 

UME1 


UME1 is a transcription factor. SDS3 is part of histone deacetylase 
complex involved in transcriptional gene silencing. 


SMC4 

NUM1 


SMC4 is involved in mitotic chromosome condensation. NUM1 is 
involved in polymerization and stabilization of microtubules. 



PEX17 
PEX13 


Peroxisome organization and biogenesis. 


ARG80 

MCM1 


ARG80 is a transcription factor regulating genes ARG81 and ARG82. 
MCM1 is also a transcription factor. 








YDR449C 
MPP10 

YLR409C 
YDR324C 


NAN1, YDR324C, YDR449C and MPP10 are linked with snoRNA 
binding and processing of 20S pre-rRNA. Therefore, the other 
unannotated proteins are likely linked with the same function. 




NAN1 

YKR060W 


SEN15 

HRR25 


SEN15 is part of the tRNA-intron endonuclease complex involved in tRNA 
splicing. HRR25 is a casein kinase linked with several biological 
processes. The link between the two is not clear. HRR25 also shares many 
partners with TEM1. 




RPD3 

SIN3 

YDL076C 


RPD3 and SIN3 are part of histone deacetylase complex involved in 
chromatin silencing. Therefore, we strongly suspect YDL076C 
(RXT3) to be involved in the same function. 




STI1 

CPR6 


Both proteins are involved in protein folding. 


BNR1 

BNI1 


Both proteins regulate actin cytoskeleton. Double deletions are 
temperature sensitive and show deficiency in bud emergence. 


IST1 

GCD11 


GCD11 is a translation initiation factor. Therefore, we strongly suspect 
unannotated IST1 to be involved in a similar process. It forms 
associations with many other proteins acting as translation initiation 
factors. 






CDC10 
CDC12 

CDC3 
CDC11 


They all are involved in proper bud growth. Only CDC10 is viable. 
They share partners with proteins of KCC4 group involved in bud ring 
formation. 


KAP120 

SSA2 


KAP120 is a structural constituent of the nuclear pore, whereas SSA2 is 
involved in chaperoning as well as SRP-dependent, membrane 
targeting, translocation. Double deletion should be studied. They also 
share partners with other nuclear pore complex proteins, including NUP133, 
as well as transport proteins, including MTR10. 







GCD2 
SUI3 

GCD6 
GCD1 

GCN3 

SUI2 
GCD7 


Eukaryotic translation initiation factor 2 (eIF2) complex. 




YGR089W 

NNF1 


NNF1 is involved in chromosome segregation (spindle pole) and mitosis. 
Therefore, we suspect unannotated YGR089W (NNF2) is involved in 
similar function. 




YPT53 
VPS21 


Both are GTPase proteins linked with endocytosis in late endosome. 
Double deletion experiment is recommended. 




RFC5 

RFC4 

RFC3 

RFC2 


DNA replication factor C complex. RFC3 and RFC4 are ancient 
paralogs. 


VPS17 

VPS19 


Both proteins are involved in vacuole to Golgi or endosome to Golgi 
transport. Double deletion is recommended. 




SNO1 

YMR322C 


SNO1 is involved in pyridoxine metabolism. Therefore, it is likely that 
YMR322C (SNO4) is involved in the same function. Double deletion is 
recommended. 




BMH1 

BMH2 


Both are RAS signaling proteins that activate MAPK. It is known that 
null mutants of individual genes are viable but double deletion is 
inviable. 




TFS1 

NSP1 


NSP1 is a nuclear pore protein involved in transport. Unannotated 
TFS1 is a lipid-binding protein. Therefore, it is possibly involved in 
transport between nucleus and cytoplasm using the NSP1 pore. 





PCL7 

PCL6 


They are both cyclin-dependent protein kinases acting as regulator 
proteins. Double deletion is recommended. 








NUP49 

NUP57 

NUP42 

NUP100 

NUP116 

GSP1 
NUP60 

NUP2 

NUP1 


Nuclear pore proteins involved in NLS-bearing substrate-nucleus 
import, mRNA-binding (hnRNP) protein-nucleus import, mRNA- 
nucleus export, nuclear pore organization and biogenesis, protein- 
nucleus export, rRNA-nucleus export, ribosomal protein-nucleus 
import, snRNA-nucleus export, snRNP protein-nucleus import, tRNA- 
nucleus export. NUP100 and NUP116 are ancient paralogs. 




HRT1 
YLR097C 


HRT1 belongs to nuclear ubiquitin ligase complex involved in the cell 
cycle. Therefore, YLR097C (HRT3) is possibly linked with similar 
function. 




GIC1 

GIC2 

CLA4 


Active in rho-protein signal transduction pathway. Also forms 
associations with ZDS1, ZDS2 controlling cell-polarity. GIC1 and 
GIC2 are ancient paralogs. 



YGR156W 

RNA14 

YTH1 

YKL059C 

CFT2 

CFT1 

YSH1 

PAP1 

REF2 

PTA1 

FIP1 

PFS2 

SWD2 

HCA4 

GLC7 



They are parts of mRNA cleavage and poly-adenylation specificity 
factor complex. HCA4 is linked with 35S primary transcript 
processing. 



Involved in DNA repair. 



RFA3 
RAD59 



HOR2 
RHR2 
AAC3 



All metabolic proteins. HOR2 and RHR2 are involved in glycerol 
metabolism and response to osmotic stress. AAC3 is involved in 
ATP/ADP exchange. 



RPA190 

RPA12 

RPA135 



DNA-directed RNA polymerase I complex. 



SAH1 



CYS4 



Both involved in metabolic process. 



EPL1 
ESA1 
ARP4 



Regulation of transcription from Pol II promoter as histone 
acetyltransferase complex. 



YHR197W 

YNL182C 

YLR106C 



They are linked with a large group of unannotated proteins linked with 
ribosomal large subunit assembly and maintenance. 



BMS1 
SIK1 


Both involved in snoRNA binding and 35S primary transcript 
processing. 






ELP4 
ELP6 

ELP3 

IKI1 

ELP2 

IKI3 


Transcription elongation factor complex regulating transcription 
from Pol II promoter. 


GCN2 

NHP2 


GCN2 is a protein kinase linked with protein amino acid 
phosphorylation, whereas NHP2 is associated with 35S primary 
transcript processing. 


TSM1 

TAF25 


TFIID complex, a general RNA polymerase II transcription 
factor. 








RRP6 

RRP4 

RRP43 

RRP42 
MTR3 

SKI6 

RRP45 

DIS3 
RRP46 

CSL4 

SKI7 


They all form nuclear exosome (RNase complex) and are involved in 
35S primary transcript processing and mRNA catabolism. 


YGR128C 

DIP2 


Processing of 20S pre-rRNA. 


YGR215W 
YGT 1 9qr 


They are both structural constituents of ribosome involved in protein 
biosynthesis. Double deletion strongly recommended. 


YMR145C 

BGL2 


YMR145C is an NADH dehydrogenase involved in ethanol 
fermentation, whereas BGL2 is a glucan 1,3 beta-glucosidase involved 
in cell wall organization and biogenesis. Their link is not clear to us. 
Double deletion recommended. 





YPT32 

YPT10 

YPT7 

YPT31 

YPT1 

SEC4 

YPT52 


They are involved in Golgi to vacuole transport or vesicle-mediated 
transport. 




HHT1 

HTB1 

HTA1 

HHF1 


They are parts of nucleosome complex involved in chromatin 
assembly and disassembly. 


STE20 

FAR1 


STE20 is a protein kinase involved in cell cycle progression, whereas 
FAR1 is a protein kinase inhibitor involved in cell-cycle arrest. 


ACO1 

YGL245W 


ACO1 is an aconitate hydratase involved in glutamate biosynthesis, 
whereas YGL245W is a glutamate-tRNA ligase. 


YGR010W 

YLR328W 


Both involved in nicotinamide adenine dinucleotide metabolism. 
Double deletion recommended. 


SNF4 

SNF1 


SNF1 is a protein kinase linked with glucose metabolism, whereas 
SNF4 is a protein kinase activator involved in regulation of 
transcription from Pol II promoter. Possibly they are in the same 
pathway. 


DED81 

PDR13 


DED81 is an asparagine-tRNA ligase, and PDR13 is involved in protein 
biosynthesis and acts as a chaperone. They are also linked with other proteins 
involved in glycine-tRNA aminoacylation (ERG10 and GRS1). 


FOL2 

POR1 


FOL2 is a GTP cyclohydrolase involved in folic acid and derivative 
biosynthesis, whereas POR1 is involved in ion transport and aerobic 
respiration. 


SYS1 

YBR098W 


SYS1 is involved in Golgi to endosome transport and vesicle 
organization and biogenesis. It is not clear why YBR098W, which also 
shares many partners with other proteins involved in cellular 
organization, is classified as a DNA repair protein in SGD. 



QCR2 
COR1 


They both belong to respiratory chain complex III and are involved in 
aerobic respiration. 




NOP12 

YPL012W 

YKL014C 


Ribosomal biogenesis and pre-rRNA processing. 


FUS3 

KSS1 


They are both MAP kinases involved in signal transduction of mating 
signals. Double deletion should be interesting. 


SPH1 

SPA2 


Involved in Rho protein signal transduction, actin filament 
organization, establishment of cell polarity, polar budding and 
pseudohyphal growth. 






KCC4 

GIN4 

YDL225W 


YDL225W is involved in cytokinesis and formation of bud-ring. The 
remaining two are protein kinases active in axial budding, bud growth, 
protein amino acid phosphorylation, septin assembly, septum 
formation, septin checkpoint at the bud neck. KCC4 and GIN4 are 
ancient paralogs. 






SEC34 

SED5 

SEC35 


Involved in ER to Golgi transport and intra-Golgi transport. 






VPS16 

PEP5 

VPS33 


Involved in Golgi to endosome transport, homotypic vacuole fusion 
(non-autophagic), late endosome to vacuole transport, protein-vacuolar 
targeting and vacuole organization and biogenesis. 




CLB1 
CLB3 


Cyclin-dependent protein kinase involved in mitotic induction. 






HSC82 

HSP82 

SBA1 


HSC82 and HSP82 are heat shock proteins and ancient paralogs, whereas 
SBA1, a protein linked with chaperoning, is known to bind with the 
HSP90 heat shock complex. 






AAD14 

HPA3 

YIP3 


HPA3: histone acetylation; AAD14: aldehyde metabolism; 
YIP3: COPII-coated vesicle. Link unclear. Synthetic mutation 
recommended. Most of their other partners are involved in different 
metabolic functions. 



MYO3 
MYO5 
UBP7 



Both MYO3 and MYO5 are class I myosins involved in cytokinesis 
through transport of membrane bound components. They are ancient paralogs 
originating from ancient gene duplication. Deletion of either of them 
has little effect on cell growth, but double deletion causes severe 
defects in growth and actin cytoskeleton organization. Link with 
UBP7 is not clear. 




They are part of 20S core proteasome involved in ubiquitin-dependent 
protein catabolism. 



PRE9 
PRE6 
PRE5 
PUP3 
PRE4 
SCL1 
PRE8 
PRE2 



VMA4 
VMM 
VMA13 



They are involved in vacuolar acidification. Double or triple mutation 
should be tried. 



THI4 
YNK1 



THI4 is involved in thiamin biosynthesis and DNA repair. Therefore, 
YNK1, a nucleoside diphosphate kinase, is suspected to be linked in 
the same pathway. Double deletion study will clarify their link. 



LAS17 

YNL094W 

YMR192W 



Actin filament assembly. YMR192W (APP2) is unannotated. 
YNL094W (APP1) is partly annotated and linked with actin filament 
assembly. 



PDB1 
YDR430C 



PDB1 is a pyruvate dehydrogenase involved in pyruvate metabolism. 
Therefore, yet unannotated YDR430C (CYM1), a cytosolic 
metalloprotease, is suspected to have similar function. Double deletion 
experiment is recommended. 



PRP19 

CLF1 

SYF1 

SYF2 

ISY1 

SNT309 

ECM1 



Spliceosome complex involved in mRNA splicing. 



PRB1 
YDR214W 



PRB1 responds to starvation and is needed for full protein degradation 
during sporulation. YDR214W (AHA1), a heat shock protein, is 
suspected to have similar function in shock response. Double deletion 
study is recommended (low confidence). 



They are all involved in DNA strand annealing and repair. 



MSH6 

RFA1 

RAD52 



Parts of snRNA complex involved in mRNA splicing. 



DCP1 
PAT1 
PRP24 
LSM7 
LSM5 
LSM6 
LSM2 
LSM3 
LSM8 
LSM4 
LSM1 
KEM1 



COF1 
CPH1 



COF1 is linked with actin filament depolymerization whereas CPH1 is 
associated with histone deacetylase complex. Their other links (OYE2, 
CYR1, MYO4 etc.) make us suspect that these two processes are 
linked with each other. 



Signal-transducer and nucleus-cytoplasm transport. 



SRM1 
NTF2 



ECM31 
PRO3 
CDD1 
YJL199C 



CDD1: cytidine deaminase; ECM31: pantothenate biosynthesis; 
PRO3: proline biosynthesis. Therefore, YJL199C (MBB1) is likely to 
be involved in a metabolic process. 



DRS1 

PUF6 

RLP7 

NOG1 

YHR052W 

YGR103W 

YTM1 

HAS1 

YER006W 

DBP10 

SDA1 

YLR074C 

YER126C 

SSF1 

YNL110C 

YKR081C 

NOP2 

YMR049C 

CDC95 

YGL111W 



Ribosomal large subunit assembly and maintenance. 



RPO26 

RPB5 

RPB8 

RPB10 

RPC19 



25kDa RNA-polymerase subunit common to all Pol I, II and III. 



VAM7 
YHR105W 



VAM7 is a v-SNARE protein linked with Golgi to vacuole transport. 
Therefore, we strongly suspect that YHR105W (YPT35) is active in 
similar function. It also shares partners with YPT1 and YPT32 
involved in similar functions. Double deletion experiment is 
recommended. 



Regulation of transcription from Pol II promoter. 



HIR1 
HIR2 



PET9 
YKR046C 



PET9 is linked with ATP/ADP exchange. Therefore, we suspect that 
YKR046C (PET10) is active in similar function. 



MRPL4 

MRP7 

MRPL7 

MRPL28 

YML025C 

MRPL35 

MRPL3 

MRPL8 



They form the mitochondrial large ribosomal subunit. 



RLR1 
GBP2 



GBP2 is involved in telomeric DNA binding, whereas RLR1 is 
suspected to play a role in transcription elongation by Pol II. Double 
mutation would clarify their closer link. They are also weakly linked 
with HPR1 (DNA-dependent transcription) and MFT1 (protein-mitochondrial 
targeting) proteins. 



MSS116 
YLR432W 



MSS116 is an RNA helicase linked with RNA splicing. YLR432W 
(IMD3) is IMP dehydrogenase. However, their strong links with other 
proteins involved with RNA metabolism (such as NOP12) make us 
suspect that they are both in that pathway. Double mutation of them 
may establish the point. 



TAF61 

ADR1 

GCN4 

NGG1 

ADA2 

TAF17 

TAF145 

TAF90 

TAF60 

SPT15 



TFIID complex and related regulators. 



MYO2 



MYO4 



They are both class V myosins involved in endocytosis. 



PRT1 

RPG1 

NIP1 

TIF34 

TIF35 

HCR1 

TIF5 



TIF34, TIF35, NIP1, PRT1, RPG1, TIF5 are involved in translation 
initiation as part of eIF3 complex. Unannotated protein HCR1 is 
strongly suspected to be linked with the same process. 





GTT1 

YEL017W 


GTT1 is linked with glutathione metabolism. Therefore, it is suspected 
that unannotated YEL017W (GTT3) is linked with similar metabolic 
purpose. 




RPC37 
RPC53 

RPC82 
RET1 

RPC31 


Parts of Pol III complex. 




YPL004C 
YGR086C 

OYE2 
CAR2 


CAR2 is involved in amino-acid metabolism. OYE2 is also an NADPH 
dehydrogenase. Therefore, it is likely that the other two yet unannotated 
proteins are also linked with similar metabolic purposes. Since they 
are all individually viable, multiple deletion experiments may reveal 
their functional link. 






RSC8 

ISW1 

ITC1 


RSC8 and ISW1 are involved in chromatin modeling. ITC1 is a protein 
with unknown function. 


CDC4 

GRR1 


Parts of ubiquitin ligase complex involved in G1/S transition of 
mitotic cell cycle. 






VPS35 

VPS5 

PEP8 


Golgi retention or retrograde transport. Forms associations with 
proteins in VPS17 module. 






ABF1 

CHD1 

FKH1 


CHD1 is a Pol II transcription elongation factor. FKH1 and ABF1 are 
transcription factors related to chromatin silencing at HML and HMR. 






DPB2 

POL2 

DPB4 


Parts of epsilon DNA polymerase complex, involved in DNA 
mismatch repair and strand elongation. 



MRP1 

YNL306W 

RSM10 

NAM9 

RSM22 

RSM25 

MRPS9 

MRP13 

MRP51 



Mitochondrial small ribosomal subunit. 



YPR144C 

IMP3 

YDL148C 

YLR186W 

YJL109C 

KRE33 

NOP1 

YBL004W 

ECM16 



They are involved in snoRNA binding, 35S primary transcript 
processing, processing of 20S pre-rRNA, rRNA modification. 



BEM1 

CDC24 

CDC42 



Signaling proteins involved in establishment of cell polarity, bud 
growth and shmooing. 



SMX2 

SMX3 

PRP8 

SMD1 

CUS1 

SME1 



Involved in mRNA splicing in small nuclear ribo-nucleoprotein 
complex. 



YPL246C 

KTR3 

YJL151C 

YGL104C 

YKR030W 



KTR3 is a mannosyltransferase involved in cell-wall synthesis and 
biogenesis. The whole complex either has similar function or is involved in 
protein-vacuolar targeting. 



CDC73 
PAF1 

LEO1 

RTF1 

1 QPT< 

or 1 J 


CDC73/PAF1 complex involved as transcription elongation factor 
from Pol II promoter. 


TIM54 


Both of them are protein transporters involved in mitochondrial 
translocation. 


MHR1 

YDR116C 


Both proteins have been localized to the mitochondrion. MHR1 is a 
transcription regulator involved in mitochondrial genome 
maintenance, whereas YDR116C is part of mitochondrial large 
ribosomal subunit. 


MUD2 

MSL5 


Both are involved in mRNA-splicing. 


YGL099W 

YDR101C 


Both unknown. Based on our study, we suspect these and the following 
other proteins are involved in processing of 27S pre-rRNA ribosomal 
subunit. They are NOG1, YGR103W, HAS1, CDC95, RLP7, 
YKR081C, YHR052W, YMR049C, YTM1, NOP2, YDR101C, 
YOR206W, YNL110C. 


RTS1 

YGR161C 


RTS1 is part of protein phosphatase 2A complex. Therefore, it is 
possible that YGR161C (RTS3) is also involved in similar function. 


ACF2 

YJR083C 


ACF2 is involved in actin cytoskeleton organization and biogenesis. 
Yet unannotated YJR083C (ACF4) is strongly suspected to be linked 
to the same process. 



MED11 

MED1 

RGR1 

MED6 

MED8 

MED4 

CSE2 

NUT1 

SRB8 

SRB5 

SIN4 

PGD1 



Mediator complex for transcription from Pol II promoter. 



CDH1 



CDC20 



Anaphase-promoting complex involved in cyclin catabolism, mitotic 
chromosome segregation and metaphase/anaphase transition. 



BBP1 
NIP29 



Structural constituent of cytoskeleton present in spindle pole body and 
involved in microtubule nucleation. 



SWI3 
SNF2 
SNF5 



Nucleosome remodeling complex involved in chromatin modeling. 
Should be probed for double and triple deletion for better 
understanding. 



GSY1 
GSY2 



Glycogen metabolism. 



SNU56 

STO1 

YHC1 

SNU71 

LUC7 

NAM8 

CBC2 

MUD1 

SNP1 

PRP39 

PRP40 

PRP42 

SMD2 

SMD3 



Commitment complex and snRNP U1. 



BRX1 

YOR206W 

FPR4 



All unannotated proteins possibly involved in biogenesis and transport 
of ribosome. 



PRI1 



PRI2 



Parts of alpha DNA polymerase-primase complex involved in DNA 
replication initiation. 



DUT1 

ECU 

HPA2 



Involved in metabolism. 



ERG10 
ACS2 



Both involved in acetyl-CoA biosynthesis. 



DIM1 
YOR145C 



DIM1 is involved in 35S primary transcript processing and rRNA 
modification. Therefore, it is suspected that YOR145C (DIM2) is 
involved in similar process. 



SOD1 
UBA1 



SOD1 is related to copper homeostasis, whereas UBA1 is linked with 
ubiquitin cycle. 





SEC24 

SEC23 

SAR1 


COPII complex. 




HFI1 

TRA1 

GCN5 


SPT20 


SAGA complex linked with chromatin modeling and histone 
acetylation. Only TRA1 is linked with TRAPP complex. They all take 
a role in transcription control through chromatin modeling. 


IPP1 

MDH1 


MDH1 is a malic enzyme linked with tricarboxylic acid cycle. 
Therefore, unannotated IPP1, an inorganic diphosphatase, is expected 
to have similar function. 


SAP190 


Involved in G1/S transition of cell cycle. Double mutant grows slowly. 
Triple mutant with SAP155 is inviable. 


MKK1 
MKK2 


Both are MAP kinase kinases. Single mutants are viable but double 
mutants show some defects. 


YNL041C 
YPR105C 


Parts of Golgi-transport complex and involved in intra-Golgi transport. 


UFD2 

CDC48 


Both involved in ubiquitin-dependent protein catabolism. 


YAP6 

STD1 


YAP6 is a transcription factor, whereas STD1 is involved in 
signal transduction and regulation of transcription from 
Pol II promoter. Double mutant should be studied. 



CAF130 

SIG1 

NOT5 

CAF40 

CDC39 

CCR4 

POP2 

CDC36 

NOT3 



CCR4-NOT complex regulating transcription from Pol II promoter 
and active in poly-A tail shortening. POP2 is required for glucose 
derepression. 



EFD1 
CTF8 
POL30 



CTF8 and POL30 are involved in DNA replication and repair. 
It is likely that EFD1 (YOR144C) is active in the same process. 



CDC5 
RAD53 



They are both protein threonine/tyrosine kinases involved in DNA 
repair and replication. 



MNN10 
ANP1 



Parts of mannosyltransferase complex. 



NRD1 
NAB3 
YML117W 



NRD1 is involved in nuclear RNA binding, whereas NAB3 is involved 
in poly-A binding. Therefore, YML117W (NAB6), an unannotated 
protein, is possibly involved in the same function. 



STE7 

STE11 

STE5 



MAP kinase proteins involved in the signal transduction of mating 
signal. 



MTR10 
CRM1 

PAB1 
MSN5 
KAP123 
PSE1 



All the proteins except PAB1 are involved in transport of proteins and 
mRNAs between nucleus and cytoplasm. Only PAB1 is linked with 
regulation of translation initiation. The reason why PAB1 is 
associated here with transport proteins is not clear to us. 



CDC40 
PRP43 
YGR278W 
YDL209C 



Both CDC40 and PRP43 are involved in the spliceosome complex and 
function as pre-mRNA splicing factors. Therefore, YDL209C and 
YGR278W are suspected to have similar functions. 



NYV1 

VAM3 

YKT6 

VTI1 

SEC17 



They are all part of SNARE complex and involved in transport 
between Golgi and other vesicles as well as non-selective vesicle 
fusions. 



GCN1 



ECM29 



In our analysis, ECM29 shares many pairs with the proteasome 
complex member proteins. It is linked with cell-wall organization and 
biogenesis, whereas GCN1 is linked with regulation of translational 
elongation. Their link is not clear to us. 



All these proteins are parts of proteasome complex linked with 
proteolysis and peptolysis. 



YGR232W 

RPNl 

UBP6 

RPN12 



YGL004C 

RPT1 

RPN6 

RPN5 

RPT3 

PRE1 

RPNl 

RPNl 

RPT2 
RPN3 

RPN8 

RPN9 

RPT6 
RPN7 
RPT4 
RPT5 

YLR421C 







SMC2 

SMC1 

NUF2 

SMC3 

YDL074C 


SMC2, SMC1, NUF2 and SMC3 are involved in chromosome 
condensation and segregation processes. It strongly suggests that 
YDL074C (BRE1) is involved in a similar function. 






WTM1 

SOF1 

CAF4 

CDC55 

STE4 


STE4 is a G-protein GTPase active in signaling during mating. CDC55 
is a protein phosphatase. CAF4 and WTM1 are other regulatory proteins. 
SOF1 is part of small nucleolar ribo-nucleoprotein complex. Their link 
is not clear to us. 



